61 research outputs found

    OpenDMAP: An open source, ontology-driven concept analysis engine, with applications to capturing knowledge regarding protein transport, protein interactions and cell-type-specific gene expression

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Information extraction (IE) efforts are widely acknowledged to be important in harnessing the rapid advance of biomedical knowledge, particularly in areas where important factual information is published in a diverse literature. Here we report on the design, implementation and several evaluations of OpenDMAP, an ontology-driven, integrated concept analysis system. It significantly advances the state of the art in information extraction by leveraging knowledge in ontological resources, integrating diverse text processing applications, and using an expanded pattern language that allows the mixing of syntactic and semantic elements and variable ordering.</p> <p>Results</p> <p>OpenDMAP information extraction systems were produced for extracting protein transport assertions (transport), protein-protein interaction assertions (interaction) and assertions that a gene is expressed in a cell type (expression). Evaluations were performed on each system, resulting in F-scores ranging from .26 – .72 (precision .39 – .85, recall .16 – .85). Additionally, each of these systems was run over all abstracts in MEDLINE, producing a total of 72,460 transport instances, 265,795 interaction instances and 176,153 expression instances. </p> <p>Conclusion</p> <p>OpenDMAP advances the performance standards for extracting protein-protein interaction predications from the full texts of biomedical research articles. Furthermore, this level of performance appears to generalize to other information extraction tasks, including extracting information about predicates of more than two arguments. The output of the information extraction system is always constructed from elements of an ontology, ensuring that the knowledge representation is grounded with respect to a carefully constructed model of reality. The results of these efforts can be used to increase the efficiency of manual curation efforts and to provide additional features in systems that integrate multiple sources for information extraction. The open source OpenDMAP code library is freely available at <url>http://bionlp.sourceforge.net/</url></p

    Iron Behaving Badly: Inappropriate Iron Chelation as a Major Contributor to the Aetiology of Vascular and Other Progressive Inflammatory and Degenerative Diseases

    Get PDF
    The production of peroxide and superoxide is an inevitable consequence of aerobic metabolism, and while these particular "reactive oxygen species" (ROSs) can exhibit a number of biological effects, they are not of themselves excessively reactive and thus they are not especially damaging at physiological concentrations. However, their reactions with poorly liganded iron species can lead to the catalytic production of the very reactive and dangerous hydroxyl radical, which is exceptionally damaging, and a major cause of chronic inflammation. We review the considerable and wide-ranging evidence for the involvement of this combination of (su)peroxide and poorly liganded iron in a large number of physiological and indeed pathological processes and inflammatory disorders, especially those involving the progressive degradation of cellular and organismal performance. These diseases share a great many similarities and thus might be considered to have a common cause (i.e. iron-catalysed free radical and especially hydroxyl radical generation). The studies reviewed include those focused on a series of cardiovascular, metabolic and neurological diseases, where iron can be found at the sites of plaques and lesions, as well as studies showing the significance of iron to aging and longevity. The effective chelation of iron by natural or synthetic ligands is thus of major physiological (and potentially therapeutic) importance. As systems properties, we need to recognise that physiological observables have multiple molecular causes, and studying them in isolation leads to inconsistent patterns of apparent causality when it is the simultaneous combination of multiple factors that is responsible. This explains, for instance, the decidedly mixed effects of antioxidants that have been observed, etc...Comment: 159 pages, including 9 Figs and 2184 reference

    Microbiota and chronic inflammatory arthritis: an interwoven link

    Full text link

    Minimal residual disease in breast cancer: an overview of circulating and disseminated tumour cells

    Full text link

    An annotated corpus with nanomedicine and pharmacokinetic parameters

    No full text
    Nastassja A Lewinski,1 Ivan Jimenez,1 Bridget T McInnes2 1Department of Chemical and Life Science Engineering, Virginia Commonwealth University, Richmond, VA, 2Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA Abstract: A vast amount of data on nanomedicines is being generated and published, and natural language processing (NLP) approaches can automate the extraction of unstructured text-based data. Annotated corpora are a key resource for NLP and information extraction methods which employ machine learning. Although corpora are available for pharmaceuticals, resources for nanomedicines and nanotechnology are still limited. To foster nanotechnology text mining (NanoNLP) efforts, we have constructed a corpus of annotated drug product inserts taken from the US Food and Drug Administration&rsquo;s Drugs@FDA online database. In this work, we present the development of the Engineered Nanomedicine Database corpus to support the evaluation of nanomedicine entity extraction. The data were manually annotated for 21 entity mentions consisting of nanomedicine physicochemical characterization, exposure, and biologic response information of 41 Food and Drug Administration-approved nanomedicines. We evaluate the reliability of the manual annotations and demonstrate the use of the corpus by evaluating two state-of-the-art named entity extraction systems, OpenNLP and Stanford NER. The annotated corpus is available open source and, based on these results, guidelines and suggestions for future development of additional nanomedicine corpora are provided. Keywords: nanotechnology, informatics, natural language processing, text mining, corpor
    corecore